A No-Compromises Architecture for Digital Document Preservation
Abstract
The Multivalent Document Model offers a practical, proven, no-compromises architecture for preserving digital documents of potentially any data format. We have implemented from scratch such complex and currently important formats as PDF and HTML, as well as older formats including scanned paper, UNIX manual pages, TeX DVI, and Apple II AppleWorks word processing. The architecture, stable since its definition in 1997, extends easily to additional document formats; defines a cross-format document tree data structure that fully captures semantics and layout; supports full expression of a format's often idiosyncratic concepts and behavior; enables sharing of functionality across formats, thus reducing implementation effort; can introduce new functionality, such as hyperlinks and annotation, to older formats that cannot express them; and provides a single interface (API) across all formats. Multivalent contrasts sharply with emulation and conversion, and advances Lorie's Universal Virtual Computer with a high-level architecture and an extensive implementation.
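To make the architectural claims concrete, the following is a minimal sketch of the general idea only: per-format adaptors parse into one shared document tree, and a behavior written once against that tree (here, hyperlink detection) applies to formats that cannot express links themselves. Every name in it (`Node`, `parse_plain_text`, `parse_man_page`, `ADAPTORS`, `load`, `add_hyperlinks`) is hypothetical and chosen for illustration; none is taken from the Multivalent implementation, and the toy parsers handle only a tiny fraction of what real formats require.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List


@dataclass
class Node:
    """One node of a hypothetical cross-format document tree."""
    kind: str                                   # e.g. "document", "heading", "line", "word"
    text: str = ""
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)

    def walk(self) -> Iterator["Node"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def _words(kind: str, body: str) -> Node:
    """Build a node of the given kind whose children are word nodes."""
    return Node(kind, children=[Node("word", w) for w in body.split()])


def parse_plain_text(data: str) -> Node:
    """Hypothetical adaptor: every input line becomes a 'line' node."""
    root = Node("document", attrs={"format": "text"})
    root.children = [_words("line", line) for line in data.splitlines()]
    return root


def parse_man_page(data: str) -> Node:
    """Hypothetical adaptor: '.SH ' lines become headings, the rest plain lines."""
    root = Node("document", attrs={"format": "man"})
    for line in data.splitlines():
        if line.startswith(".SH "):
            root.children.append(_words("heading", line[4:]))
        else:
            root.children.append(_words("line", line))
    return root


# Single entry point across formats: callers never see format-specific code.
ADAPTORS: Dict[str, Callable[[str], Node]] = {
    "text": parse_plain_text,
    "man": parse_man_page,
}


def load(fmt: str, data: str) -> Node:
    """Parse data of the named format into the shared tree."""
    return ADAPTORS[fmt](data)


def add_hyperlinks(root: Node) -> int:
    """Format-independent behavior: tag URL-like words with an 'href' attribute."""
    count = 0
    for node in root.walk():
        if node.kind == "word" and node.text.startswith(("http://", "https://")):
            node.attrs["href"] = node.text
            count += 1
    return count
```

Because `add_hyperlinks` depends only on the tree, registering an adaptor for a new format in `ADAPTORS` would make the same behavior available to that format without further work, which is the kind of cross-format sharing the abstract describes.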